Skip to content

feat(rest-api): Support for adding Tenant information when reported Issue for Delete Instance#2829

Open
hwadekar-nv wants to merge 10 commits into
NVIDIA:mainfrom
hwadekar-nv:fix/tenant-deleted-issue
Open

feat(rest-api): Support for adding Tenant information when reported Issue for Delete Instance#2829
hwadekar-nv wants to merge 10 commits into
NVIDIA:mainfrom
hwadekar-nv:fix/tenant-deleted-issue

Conversation

@hwadekar-nv

@hwadekar-nv hwadekar-nv commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

As detailed in the description provided here: #2107

This PR aims to address the following issues:

  1. When a Tenant Repair Issue is present, the ReleaseInstance call should include Tenant and User information as a JSON string.
  2. A unit test has been added to verify that the ReleaseInstance request contains Tenant and User information when a Tenant Repair Issue is present.

Type of Change

  • Add - New feature or capability

Testing

  • Unit tests added/updated

@hwadekar-nv hwadekar-nv self-assigned this Jun 23, 2026
@hwadekar-nv hwadekar-nv requested a review from a team as a code owner June 23, 2026 23:37
@hwadekar-nv hwadekar-nv force-pushed the fix/tenant-deleted-issue branch from 0fc7298 to b086bab Compare June 23, 2026 23:38
@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

The request proto gains delete attribution fields, the REST API populates them from user and tenant context, and the instance release path logs them. Call sites and tests are updated to send the new optional field.

Changes

Delete Attribution Flow

Layer / File(s) Summary
Proto request and attribution messages
crates/rpc/proto/forge.proto
Adds DeleteInitiatedBy and DeleteAttribution, and extends InstanceReleaseRequest with optional delete_attribution.
REST delete request serialization
rest-api/api/pkg/api/model/instance.go, rest-api/api/pkg/api/model/instance_test.go
ToProto now accepts the caller user and builds DeleteAttribution from instance tenant and user IDs; tests assert the new attribution fields alongside issue mapping cases.
Delete handler workflow wiring
rest-api/api/pkg/api/handler/instance.go, rest-api/api/pkg/api/handler/instance_test.go
The delete handler passes the user into ToProto, and the handler test validates status history, the recorded workflow request’s delete attribution, and the mapped machine-health issue data.
API core request propagation and logging
crates/admin-cli/src/instance/release/cmd.rs, crates/api-core/src/handlers/instance.rs, crates/api-core/src/tests/common/api_fixtures/instance.rs, crates/api-core/src/tests/instance.rs, crates/api-core/src/tests/instance_config_update.rs
Client and test release requests now set delete_attribution: None, the API-core release handler logs attribution when present, and instance lifecycle tests are updated to use the expanded request shape.
NSG attachment test updates
crates/api-core/src/tests/network_security_group.rs
The NSG attachment test switches instance setup, updates attachment assertions, and sends delete_attribution: None on release.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title is clearly related to the delete-instance attribution change, even if phrasing is awkward.
Description check ✅ Passed The description matches the PR’s goal of including tenant and user information during ReleaseInstance for tenant repair issues.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands.

@github-actions

github-actions Bot commented Jun 23, 2026

Copy link
Copy Markdown

🔍 Container Scan Summary

Service Total Critical High Medium Low Other
boot-artifacts-aarch64 3 0 0 3 0 0
boot-artifacts-x86_64 3 0 0 3 0 0
forge-admin-cli-x86_64 285 6 26 102 7 144
machine-validation-runner 744 32 188 267 36 221
machine_validation 744 32 188 267 36 221
machine_validation-aarch64 744 32 188 267 36 221
nvmetal-carbide 744 32 188 267 36 221
TOTAL 3267 134 778 1176 151 1028

Per-CVE detail lives in the per-service grype-* artifacts (JSON + SARIF). Severity counts only — no CVE IDs published here.

@github-actions

Copy link
Copy Markdown

@github-actions

Copy link
Copy Markdown

🔐 TruffleHog Secret Scan

No secrets or credentials found!

Your code has been scanned for 700+ types of secrets and credentials. All clear! 🎉

🔗 View scan details

🕐 Last updated: 2026-06-23 23:40:21 UTC | Commit: 0fc7298

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@rest-api/api/pkg/api/handler/instance_test.go`:
- Around line 9726-9750: Replace the conditional guard that checks if tsc.Calls
has elements and if Arguments has sufficient length with mandatory require
statements. Use require.NotEmpty(t, tsc.Calls) to ensure the test fails if calls
are missing when MachineHealthIssue is present, then safely access the Arguments
array without the length check. This makes the workflow-attribution assertions
mandatory rather than allowing them to be silently skipped, ensuring regressions
are caught deterministically.

In `@rest-api/api/pkg/api/handler/instance.go`:
- Around line 4906-4909: The error handling block in the transaction closure
(where derr is checked) is returning cutil.NewAPIErrorResponse(c,
http.StatusInternalServerError, "Failed to delete Instance", nil), but inside
WithTx closures you should return cutil.NewAPIError instead and let the outer
handler translate it after transaction unwind. Replace the
cutil.NewAPIErrorResponse call with cutil.NewAPIError, removing the context
parameter c and keeping only the status code, message, and data parameters.

In `@rest-api/api/pkg/api/model/instance_test.go`:
- Around line 2688-2695: The test currently only validates top-level keys in the
parsed JSON structure but does not check the nested structure of tenant_reported
to ensure the omitempty contract is enforced for the details field. After the
existing assertions that validate wantRawKeys and wantAbsentKeys at the top
level, add additional assertions that unmarshal the tenant_reported object from
the parsed map and verify that the details field is properly absent (not null)
when it should be omitted according to the omitempty semantics for nil or empty
detail cases.

In `@rest-api/api/pkg/api/model/instance.go`:
- Around line 1651-1678: The ToInstanceDeleteAttributionConfig method serializes
the config to JSON and returns it as a string, but does not validate that the
resulting serialized payload respects the DeleteAttributionIssueDetailsMaxLen
size contract. After the json.Marshal call succeeds and before the return
statement, add a length validation check to ensure the serialized string data
does not exceed DeleteAttributionIssueDetailsMaxLen. If the length exceeds the
limit, return an empty string and an appropriate error to prevent cross-service
contract violations when this attribution payload is used as Issue.Details
elsewhere.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: bc6cda04-ca4d-4afe-9d9b-12145bc8d98f

📥 Commits

Reviewing files that changed from the base of the PR and between 8f81824 and b086bab.

📒 Files selected for processing (4)
  • rest-api/api/pkg/api/handler/instance.go
  • rest-api/api/pkg/api/handler/instance_test.go
  • rest-api/api/pkg/api/model/instance.go
  • rest-api/api/pkg/api/model/instance_test.go

Comment thread rest-api/api/pkg/api/handler/instance_test.go Outdated
Comment thread rest-api/api/pkg/api/handler/instance.go Outdated
Comment on lines +2688 to +2695
var parsed map[string]json.RawMessage
require.NoError(t, json.Unmarshal([]byte(raw), &parsed))
for _, key := range tt.wantRawKeys {
assert.Contains(t, parsed, key)
}
for _, key := range tt.wantAbsentKeys {
assert.NotContains(t, parsed, key)
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Add nested raw-key assertions for tenant_reported.details omission semantics.

The current raw JSON check only validates top-level keys. For the nil/empty details cases, the test should also assert that tenant_reported.details is absent (not null) to fully lock the omitempty contract.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@rest-api/api/pkg/api/model/instance_test.go` around lines 2688 - 2695, The
test currently only validates top-level keys in the parsed JSON structure but
does not check the nested structure of tenant_reported to ensure the omitempty
contract is enforced for the details field. After the existing assertions that
validate wantRawKeys and wantAbsentKeys at the top level, add additional
assertions that unmarshal the tenant_reported object from the parsed map and
verify that the details field is properly absent (not null) when it should be
omitted according to the omitempty semantics for nil or empty detail cases.

Comment thread rest-api/api/pkg/api/model/instance.go Outdated
Comment on lines +1651 to +1678
// ToInstanceDeleteAttributionConfig builds the delete attribution config from the API request.
func (idr *APIInstanceDeleteRequest) ToInstanceDeleteAttributionConfig(user *cdbm.User, instance *cdbm.Instance) (string, error) {
config := InstanceDeleteAttributionConfig{
InitiatedBy: InstanceDeleteInitiatedBy{
Org: instance.Tenant.Org,
UserID: user.ID.String(),
TenantID: instance.Tenant.ID.String(),
TenantOrg: instance.Tenant.Org,
},
}
if idr.MachineHealthIssue != nil {
mhi := idr.MachineHealthIssue
reported := InstanceDeleteAttributionTenantReported{
Category: strings.ToUpper(mhi.Category),
}
if mhi.Summary != nil {
reported.Summary = *mhi.Summary
}
if mhi.Details != nil && *mhi.Details != "" {
reported.Details = mhi.Details
}
config.TenantReported = &reported
}
data, err := json.Marshal(config)
if err != nil {
return "", err
}
return string(data), nil

@coderabbitai coderabbitai Bot Jun 23, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Enforce the Issue.Details size contract before returning the serialized attribution payload.

At Line 1674, the serialized JSON can exceed DeleteAttributionIssueDetailsMaxLen even when request fields pass validation (Summary and Details can each be 1024). That creates a cross-service contract risk when this string is forwarded as Issue.Details.

💡 Proposed fix
 data, err := json.Marshal(config)
 if err != nil {
 	return "", err
 }
+if len(data) > DeleteAttributionIssueDetailsMaxLen {
+	return "", fmt.Errorf("delete attribution payload exceeds %d bytes", DeleteAttributionIssueDetailsMaxLen)
+}
 return string(data), nil
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@rest-api/api/pkg/api/model/instance.go` around lines 1651 - 1678, The
ToInstanceDeleteAttributionConfig method serializes the config to JSON and
returns it as a string, but does not validate that the resulting serialized
payload respects the DeleteAttributionIssueDetailsMaxLen size contract. After
the json.Marshal call succeeds and before the return statement, add a length
validation check to ensure the serialized string data does not exceed
DeleteAttributionIssueDetailsMaxLen. If the length exceeds the limit, return an
empty string and an appropriate error to prevent cross-service contract
violations when this attribution payload is used as Issue.Details elsewhere.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hwadekar-nv This one seems important?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

Comment thread rest-api/api/pkg/api/model/instance.go Outdated

// InstanceDeleteInitiatedBy records who initiated an Instance delete in the cloud API layer.
type InstanceDeleteInitiatedBy struct {
Org string `json:"org"`

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the difference between org and tenant_org?

Comment thread rest-api/api/pkg/api/model/instance.go Outdated
}

// InstanceDeleteAttributionTenantReported captures tenant-reported issue fields included in delete attribution.
type InstanceDeleteAttributionTenantReported struct {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment thread rest-api/api/pkg/api/model/instance.go Outdated
// InstanceDeleteAttributionConfig is the canonical JSON payload persisted on delete and forwarded to Site when an issue is reported.
type InstanceDeleteAttributionConfig struct {
InitiatedBy InstanceDeleteInitiatedBy `json:"initiated_by"`
TenantReported *InstanceDeleteAttributionTenantReported `json:"tenant_reported,omitempty"`

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we already send this information when Tenant specifies it? Seems like we're duplicating the information?

Comment thread rest-api/api/pkg/api/model/instance.go Outdated
}

// ToInstanceDeleteAttributionConfig builds the delete attribution config from the API request.
func (idr *APIInstanceDeleteRequest) ToInstanceDeleteAttributionConfig(user *cdbm.User, instance *cdbm.Instance) (string, error) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should call this GetTenantAttributionDetails. Also let's pass user and tenant as argument, we don't seem to use Instance for anything other than accessing Tenant.

@sunilkumar-nvidia

Copy link
Copy Markdown
Contributor

I’d recommend moving delete attribution out of Issue.details and into a new optional delete_attribution field on InstanceReleaseRequest. That keeps details backward-compatible as tenant-provided issue text, works for both issue and no-issue delete flows, and lets Core store attribution as a separate structured field instead of parsing escaped JSON from details. For all-delete audit coverage, the API should populate this field on every delete and also emit/persist structured audit metadata cloud-side.

Something like:
Add a new optional proto field on InstanceReleaseRequest, outside issue:

`
message DeleteInitiatedBy {
string org = 1;
string user_id = 2;
string tenant_id = 3;
string tenant_org = 4;
}

message DeleteAttribution {
DeleteInitiatedBy initiated_by = 1;
}

message InstanceReleaseRequest {
common.InstanceId id = 1;
optional Issue issue = 2;
optional bool is_repair_tenant = 3;
optional DeleteAttribution delete_attribution = 4;
}`

@hwadekar-nv

Copy link
Copy Markdown
Contributor Author

Thanks @thossain-nv @sunilkumar-nvidia, while implementing this, I also felt to have separate field in proto instead using Issue.details, if okey, will add that into proto?

@hwadekar-nv hwadekar-nv force-pushed the fix/tenant-deleted-issue branch 2 times, most recently from 4c481b5 to 1b19e2c Compare June 25, 2026 20:29
@hwadekar-nv

Copy link
Copy Markdown
Contributor Author

@thossain-nv @sunilkumar-nvidia Added deletionAttribute in proto, and made chnages in both rest-api as well as CORE, please review when possible

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@rest-api/api/pkg/api/handler/instance_test.go`:
- Around line 9746-9763: The release request assertions in the instance test are
reading from the wrong mock, so they can validate a stale call from another
subtest. Update this block to assert against the mock instance actually used by
the current case (for example, the current test’s release-client mock such as
tsc or tscWithNICoNotFound depending on the path), and fetch the latest
InstanceReleaseRequest from that mock before checking DeleteAttribution and
Issue fields. Ensure the assertions are tied to the current subtest setup rather
than always using tsc.Calls.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: dd004154-dd2f-438b-8d87-6662b273fbdb

📥 Commits

Reviewing files that changed from the base of the PR and between b086bab and 1b19e2c.

⛔ Files ignored due to path filters (7)
  • rest-api/workflow-schema/schema/site-agent/workflows/v1/fmds_nico_grpc.pb.go is excluded by !**/*.pb.go, !rest-api/**/*.pb.go, !rest-api/**/*_grpc.pb.go
  • rest-api/workflow-schema/schema/site-agent/workflows/v1/nico_nico.pb.go is excluded by !**/*.pb.go, !rest-api/**/*.pb.go
  • rest-api/workflow-schema/schema/site-agent/workflows/v1/nico_nico_grpc.pb.go is excluded by !**/*.pb.go, !rest-api/**/*.pb.go, !rest-api/**/*_grpc.pb.go
  • rest-api/workflow-schema/schema/site-agent/workflows/v1/nmx_c_nico_grpc.pb.go is excluded by !**/*.pb.go, !rest-api/**/*.pb.go, !rest-api/**/*_grpc.pb.go
  • rest-api/workflow-schema/schema/site-agent/workflows/v1/site_explorer_nico.pb.go is excluded by !**/*.pb.go, !rest-api/**/*.pb.go
  • rest-api/workflow-schema/site-agent/workflows/v1/nico_nico.proto is excluded by !rest-api/workflow-schema/site-agent/workflows/v1/*_nico.proto
  • rest-api/workflow-schema/site-agent/workflows/v1/site_explorer_nico.proto is excluded by !rest-api/workflow-schema/site-agent/workflows/v1/*_nico.proto
📒 Files selected for processing (11)
  • crates/admin-cli/src/instance/release/cmd.rs
  • crates/api-core/src/handlers/instance.rs
  • crates/api-core/src/tests/common/api_fixtures/instance.rs
  • crates/api-core/src/tests/instance.rs
  • crates/api-core/src/tests/instance_config_update.rs
  • crates/api-core/src/tests/network_security_group.rs
  • crates/rpc/proto/forge.proto
  • rest-api/api/pkg/api/handler/instance.go
  • rest-api/api/pkg/api/handler/instance_test.go
  • rest-api/api/pkg/api/model/instance.go
  • rest-api/api/pkg/api/model/instance_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • rest-api/api/pkg/api/handler/instance.go

Comment thread rest-api/api/pkg/api/handler/instance_test.go Outdated
@hwadekar-nv hwadekar-nv force-pushed the fix/tenant-deleted-issue branch 6 times, most recently from 83192f4 to a52d11f Compare June 26, 2026 20:41
@github-actions

Copy link
Copy Markdown

🔐 TruffleHog Secret Scan

No secrets or credentials found!

Your code has been scanned for 700+ types of secrets and credentials. All clear! 🎉

🔗 View scan details

🕐 Last updated: 2026-06-26 20:44:11 UTC | Commit: a52d11f

@hwadekar-nv hwadekar-nv force-pushed the fix/tenant-deleted-issue branch from a52d11f to 70be37c Compare June 26, 2026 22:11
…c v1.6.2

Align generated flow_grpc.pb.go with CI buf generate output so the
check-protobuf-generated job passes.

Signed-off-by: Hitesh Wadekar <hwadekar@nvidia.com>
Fix protoc-gen-go comment alignment drift in DNSMessage_DNSQuestion so
check-protobuf-generated passes on linux-amd64 CI runners.

Signed-off-by: Hitesh Wadekar <hwadekar@nvidia.com>
@hwadekar-nv hwadekar-nv force-pushed the fix/tenant-deleted-issue branch from 70be37c to 38e8804 Compare June 26, 2026 23:40
Sync workflow-schema and flow nicoapi generated Go from current Core proto
definitions, including Site Explorer last-run metadata RPC additions.
…v1.6.1

Regenerate workflow-schema gRPC stubs and nico.pb.go DNS comment formatting
using the same plugin versions CI installs (protoc-gen-go-grpc v1.6.1).
@sunilkumar-nvidia

sunilkumar-nvidia commented Jun 27, 2026

Copy link
Copy Markdown
Contributor

@thossain-nv @sunilkumar-nvidia Added deletionAttribute in proto, and made chnages in both rest-api as well as CORE, please review when possible

The core logic looks good now. Thanks @hwadekar-nv for addressing the feedback.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants